Generalized Graphical Abstractions for Statistical Machine Translation
نویسندگان
چکیده
We introduce a novel framework for the expression, rapid-prototyping, and evaluation of statistical machine-translation (MT) systems using graphical models. The framework extends dynamic Bayesian networks with multiple connected different-length streams, switching variable existence and dependence mechanisms, and constraint factors. We have implemented a new general-purpose MT training/decoding system in this framework, and have tested this on a variety of existing MT models (including the 4 IBM models), and some novel ones as well, all using Europarl as a test corpus. We describe the semantics of our representation, and present preliminary evaluations, showing that it is possible to prototype novel MT ideas in a short amount of time.
منابع مشابه
A new model for persian multi-part words edition based on statistical machine translation
Multi-part words in English language are hyphenated and hyphen is used to separate different parts. Persian language consists of multi-part words as well. Based on Persian morphology, half-space character is needed to separate parts of multi-part words where in many cases people incorrectly use space character instead of half-space character. This common incorrectly use of space leads to some s...
متن کاملA Post-editing Interface for Immediate Adaptation in Statistical Machine Translation
Adaptive machine translation (MT) systems are a promising approach for improving the effectiveness of computer-aided translation (CAT) environments. There is, however, virtually only theoretical work that examines how such a system could be implemented. We present an open source post-editing interface for adaptive statistical MT, which has in-depth monitoring capabilities and excellent expandab...
متن کاملTowards a User-Friendly Webservice Architecture for Statistical Machine Translation in the PANACEA project∗
This paper presents a webservice architecture for Statistical Machine Translation aimed at non-technical users. A workflow editor allows a user to combine different webservices using a graphical user interface. In the current state of this project, the webservices have been implemented for a range of sentential and sub-sentential aligners. The advantage of a common interface and a common data f...
متن کاملA Generalized Reordering Model for Phrase-Based Statistical Machine Translation
Phrase-based translation models are widely studied in statistical machine translation (SMT). However, the existing phrase-based translation models either can not deal with non-contiguous phrases or reorder phrases only by the rules without an effective reordering model. In this paper, we propose a generalized reordering model (GREM) for phrase-based statistical machine translation, which is not...
متن کاملMachine Translation Using Probabilistic Synchronous Dependency Insertion Grammars
Syntax-based statistical machine translation (MT) aims at applying statistical models to structured data. In this paper, we present a syntax-based statistical machine translation system based on a probabilistic synchronous dependency insertion grammar. Synchronous dependency insertion grammars are a version of synchronous grammars defined on dependency trees. We first introduce our approach to ...
متن کامل